Data, Corpora, and Linguistic Research


Related resources

Encoding Linguistic Corpora

This paper describes the motivation and design of the Corpus Encoding Standard (CES) (Ide et al., 1996; Ide, 1998), an encoding standard for linguistic corpora intended to meet the need for the development of standardized encoding practices for linguistic corpora. The CES identifies a minimal encoding level that corpora must achieve to be considered standardized in terms of descriptive repre...


Annotating Syllable Corpora with Linguistic Data Categories in XML

The usefulness of high-quality annotated corpora as a development aid in computational linguistic applications is now well understood. Therefore it is necessary to have systematic, easily understandable and effective means for annotating corpora at many levels of linguistic description. This paper presents a three-step methodology for annotating speech corpora using linguistic data catego...


Querying Linguistic Corpora with Prolog

In this paper we demonstrate how Prolog can be used to query linguistically annotated corpora, combining the ease of dedicated declarative query languages and the flexibility of general-purpose languages. On the basis of a Prolog representation of the German Tüba-D/Z Treebank, we show how one can tally arbitrary features of (groups of) nodes, define queries that combine information from differe...


Representing Linguistic Corpora and Their Annotations

A Linguistic Annotation Framework (LAF) is being developed within the International Standards Organization Technical Committee 37 Sub-committee on Language Resource Management (ISO TC37 SC4). LAF is intended to provide a standardized means to represent linguistic data and its annotations that is defined broadly enough to accommodate all types of linguistic annotations, and at the same time prov...


Preparation and Analysis of Linguistic Corpora

The corpus is a fundamental tool for any type of research on language. The availability of computers in the 1950s immediately led to the creation of corpora in electronic form that could be searched automatically for a variety of language features and used to compute frequency, distributional characteristics, and other descriptive statistics. Corpora of literary works were compiled to enable stylistic...



Journal

Journal title: HERMES - Journal of Language and Communication in Business

Year: 2015

ISSN: 0904-1699

DOI: 10.7146/hjlcb.v1i2.21362